A Multi-Pronged Approach to Improving Semantic Extraction of News Video

Authors

  • Alexander G. Hauptmann
  • Ming-yu Chen
  • Michael G. Christel
  • Wei-Hao Lin
  • J. Yang
Abstract

In this paper we describe a multi-strategy approach to improving semantic extraction from news video. Experiments show the value of careful parameter tuning, exploiting multiple feature sets and multilingual linguistic resources, applying text retrieval approaches to image features, and establishing synergy between multiple concepts through undirected graphical models. We present a discriminative learning framework called Multi-concept Discriminative Random Field (MDRF) for building probabilistic models of video semantic concept detectors by incorporating related concepts as well as the low-level observations. The model exploits the power of discriminative graphical models to simultaneously capture the associations of concepts with observed data and the interactions between related concepts. Compared with previous methods, this model not only captures the co-occurrence between concepts but also incorporates the raw data observations into a unified framework. We also describe an approximate parameter estimation algorithm and present results obtained from the TRECVID 2006 data. No single approach, however, provides a consistently better result for all concept detection tasks, which suggests that extracting video semantics should exploit multiple resources and techniques rather than naively relying on a single approach.
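
The abstract gives no equations for MDRF, so the following is only a minimal sketch of the general idea it describes: a pairwise discriminative random field over binary concept labels, with one association term per concept (tying the concept to the observed features) and one interaction term per pair of related concepts. The function names, the -1/+1 label coding, the linear association score, and the weights are all assumptions made for illustration and are not taken from the paper. The sketch computes the conditional distribution exactly by enumeration, which is only feasible for a handful of concepts; it does not implement the approximate parameter estimation algorithm mentioned above.

```python
import itertools
import math


def concept_posterior(x, W, U):
    """Exact P(y | x) for a small pairwise model over binary concept labels.

    x: feature vector for one video shot (hypothetical low-level features)
    W: one weight vector per concept (association terms, assumed linear)
    U: dict mapping a concept pair (i, j) to an interaction weight
    Labels are coded as -1 (concept absent) and +1 (concept present).
    """
    n = len(W)
    # Association score of each concept with the observation.
    assoc = [sum(w_k * x_k for w_k, x_k in zip(w, x)) for w in W]

    def score(y):
        s = sum(y[i] * assoc[i] for i in range(n))
        # Interaction terms reward (or penalize) agreeing labels.
        s += sum(u * y[i] * y[j] for (i, j), u in U.items())
        return s

    labelings = list(itertools.product((-1, 1), repeat=n))
    weights = [math.exp(score(y)) for y in labelings]
    z = sum(weights)  # partition function, by brute-force enumeration
    return {y: w / z for y, w in zip(labelings, weights)}


def marginal(posterior, i):
    """P(y_i = +1 | x), obtained by summing over the other labels."""
    return sum(p for y, p in posterior.items() if y[i] == 1)


# Two concepts: concept 0 has strong evidence, concept 1 has none.
x = [1.0]
W = [[2.0], [0.0]]
independent = concept_posterior(x, W, {})
coupled = concept_posterior(x, W, {(0, 1): 1.0})
print(marginal(independent, 1), marginal(coupled, 1))
```

In this toy setting, the positive interaction weight lets the evidence for concept 0 raise the marginal probability of concept 1, which is the kind of synergy between related concepts that the abstract refers to.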


Similar resources

Integrating multi-modal content analysis and hyperbolic visualization for large-scale news video retrieval and exploration

In this paper, we have developed a novel scheme to achieve more effective analysis, retrieval and exploration of large-scale news video collections by performing multi-modal video content analysis and synchronization. First, automatic keyword extraction is performed on news closed captions and audio channels to detect the most interesting news topics (i.e., keywords for news topic interpretatio...


Accessing Video Contents: Cooperative Approach between Image and Natural Language Processing

Digital video libraries are becoming much more important. To realize them, methods for accessing and extracting the semantic content of videos are essential technologies. The paper demonstrates the benefits of multi-modal video analysis for extracting the semantic content of videos. Two systems, Name-It and Spot-It, are introduced as example systems taking this approach. Name-It detects faces in news videos and ...


A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) that has been intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...


Intelligent Multimedia Indexing and Retrieval through Multi-source Information Extraction and Merging

This paper reports work on automated meta-data creation for multimedia content. The approach results in the generation of a conceptual index of the content which may then be searched via semantic categories instead of keywords. The novelty of the work is to exploit multiple sources of information relating to video content (in this case the rich range of sources covering important sports events)...


Name-It: Naming and Detecting Faces in News Video

We have developed Name-It, a system that associates faces and names in news videos. The system is given news videos, which include image sequences and transcripts obtained from audio tracks or closed caption texts. The system can then either infer possible name candidates for a given face, or locate a face in news videos by name. To accomplish this task, the system takes a multi-modal video ana...



Journal:
  • Signal Processing Systems

Volume 58


Publication date: 2010